client: report last connection error to RPCs via v1 balancer API #2508

Merged (1 commit) on Dec 7, 2018

@jeanbza jeanbza commented Dec 6, 2018

No description provided.

@jeanbza jeanbza changed the title from "plumb lastConn error in v1 balancer" to "plumb last conn attempt error in v1 balancer" Dec 6, 2018
@jeanbza jeanbza requested review from menghanl and dfawley December 6, 2018 23:24
@@ -315,12 +313,12 @@ func (bw *balancerWrapper) Pick(ctx context.Context, opts balancer.PickOptions)
Metadata: a.Metadata,
}]
if !ok && failfast {
Member
@menghanl this is a bug, right? We should return an error regardless of failfast, shouldn't we? Otherwise, we'll be trying to use sc (which will be nil) in the lookup below, and then eventually returning it from the function with no error.

Contributor
Returning nil will not cause an error because we check for it. A nil sc will result in a re-pick.

We will only get nil here when things are not in sync (between the wrapper and the v1-balancer). This means the connectivity state has changed, and a new pick was (or will be) updated. So I think re-pick makes sense.

Member
But we will also get a re-pick if we return the transient failure error immediately. Technically this works as-is, but if an error happens we should be returning an error, not a nil sc with no error.

Returning a nil sc will also result in the info log (which should probably be a warning) "subconn returned from pick is not *acBalancerWrapper", which shouldn't normally be happening (and it puts our implementation details into the user's log messages).

OK, let's go ahead with this PR but I think we should change this separately. I'll send a PR.

@jeanbza jeanbza merged commit 4be7750 into grpc:master Dec 7, 2018
@jeanbza jeanbza deleted the plumb_error branch December 7, 2018 17:12
jeanbza added a commit to jeanbza/gax-go that referenced this pull request Dec 7, 2018
Now that grpc/grpc-go#2508 is in, gRPC plumbs the
last connection error when we attempt to make an RPC on a clientconn that has
not been successfully connected.

So, in this CL, we check errors for a permanent connection problem.
Specifically, if certificates are misconfigured or ca-certificates are missing,
we do not expect to be able to establish a connection in a reasonable amount of
time, so we bail out and return the error to the user to fix.

Fixes googleapis/google-cloud-go#1234
@dfawley dfawley changed the title from "plumb last conn attempt error in v1 balancer" to "client: report last connection error to RPCs via v1 balancer API" Dec 13, 2018
@dfawley dfawley added this to the 1.18 Release milestone Dec 13, 2018
jeanbza added a commit to jeanbza/gax-go that referenced this pull request Dec 18, 2018
jeanbza added a commit to jeanbza/gax-go that referenced this pull request Dec 19, 2018
jeanbza added a commit to googleapis/gax-go that referenced this pull request Dec 19, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Jun 12, 2019
3 participants